JuliaSyntax parser-based REPL completions overhaul #57767

Merged
merged 20 commits into from
Apr 22, 2025

Member

@xal-0 xal-0 commented Mar 13, 2025

Overview

As we add REPL features, bugs related to the ad-hoc parsing done by
REPLCompletions.completions have crept in. This pull request replaces most of
the manual parsing (regex, find_start_brace) with a new approach that parses
the entire input buffer once, before and after the cursor, using JuliaSyntax.
We then query the parsed syntax tree to determine the kind of completion
to be done.

Changes

Future work

  • Method completions could be changed to look for methods with exactly the given
    number of arguments if the closing ) is present, and search for signatures
    with the right prefix otherwise.

    • It would be nice to be able to search by type as well as value (perhaps
      by putting ::T in place of arguments).
  • Other REPL features could benefit from JuliaSyntax, so it might be worth
    sharing the parse tree between completions and other features:

    • Emacs-style sexpr navigation: C-M-f/C-M-b/C-M-u, etc.
    • Improved auto-indent.
  • It would be nice if hints worked even when the cursor is between text.

  • CursorNode is a slightly tweaked copy of SyntaxNode from JuliaSyntax that
    tracks the parent node but includes all trivia. It is used with seek_pos,
    which navigates to the innermost node at a given position so we can examine
    nearby nodes and the parent. This could probably duplicate less code from
    JuliaSyntax.
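
The idea of seeking to the innermost node at a cursor position can be sketched on a toy tree. This is only an illustration under stated assumptions: `Node` and `seek_path` are hypothetical names, the tree is built by hand rather than by JuliaSyntax, and the PR's `CursorNode`/`seek_pos` may work differently. The sketch returns the whole chain of ancestors so that the parent is available alongside the innermost node.

```julia
# Hypothetical stand-in for a syntax tree node; not the PR's CursorNode.
struct Node
    kind::Symbol
    range::UnitRange{Int}   # 1-based byte range covered by this node (trivia included)
    children::Vector{Node}
end
Node(kind, range) = Node(kind, range, Node[])

# Return the chain of nodes from the root down to the innermost node whose
# range contains `pos`. An empty vector means `pos` is outside the tree.
function seek_path(root::Node, pos::Int)
    pos in root.range || return Node[]
    path = [root]
    node = root
    while true
        idx = findfirst(c -> pos in c.range, node.children)
        idx === nothing && return path
        node = node.children[idx]
        push!(path, node)
    end
end

# Hand-built tree for the 11-byte input `foo(bar, ba`.
tree = Node(:call, 1:11, [
    Node(:Identifier, 1:3), Node(:lparen, 4:4), Node(:Identifier, 5:7),
    Node(:comma, 8:8), Node(:Whitespace, 9:9), Node(:Identifier, 10:11),
])

path = seek_path(tree, 11)
println([n.kind for n in path])
```

With the cursor on the last byte, the innermost node is the partial identifier `ba` and its parent is the call, which is the kind of context a completion provider would need in order to choose between, say, identifier and method-argument completion.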

@xal-0 xal-0 added the bugfix and completions labels Mar 13, 2025
@xal-0 xal-0 requested a review from vtjnash March 13, 2025 23:38
@xal-0 xal-0 force-pushed the juliasyntax-repl branch from 385cc7f to b62b012 Compare March 13, 2025 23:40
xal-0 added 2 commits March 13, 2025 16:49
Adds another permitted return type for complete_line, where the second element
of the tuple is a Region (a Pair{Int, Int}) describing the region of text to be
replaced.  This is useful for making completions work consistently when the
closing delimiter may or may not be present: the cursor can be made to "jump"
out of the delimiters regardless of whether it is there already.

  "exam|    =TAB=>   "example.jl"|
  "exam|"   =TAB=>   "example.jl"|
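
The region-based replacement described in this commit message can be sketched in a few lines. This is a minimal illustration, not the REPL's implementation: `apply_completion` is a hypothetical helper, and it assumes that a region `a => b` holds 0-based byte offsets covering the 1-based bytes `a+1` through `b`. The convention actually used by LineEdit may differ.

```julia
# Hypothetical helper (not part of the REPL API): replace a byte region of the
# input buffer with a completion and report where the cursor ends up.
# Assumption: `region = a => b` uses 0-based byte offsets, so it covers the
# 1-based bytes a+1 through b.
function apply_completion(buffer::String, region::Pair{Int,Int}, replacement::String)
    bytes = codeunits(buffer)
    a, b = region
    new_buffer = String(bytes[1:a]) * replacement * String(bytes[b+1:end])
    new_cursor = a + ncodeunits(replacement)  # 0-based offset just past the inserted text
    return new_buffer, new_cursor
end

# Closing quote absent: the region stops at the cursor.
println(apply_completion("\"exam", 1 => 5, "example.jl\""))
# Closing quote present: the region also covers the existing quote.
println(apply_completion("\"exam\"", 1 => 6, "example.jl\""))
```

Both calls yield the same buffer, `"example.jl"`, with the cursor placed after the closing quote, which is the "jump out of the delimiters" behaviour shown in the two example lines.
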
This commit replaces the heuristic parsing done by REPLCompletions.completions
with a new approach that parses the entire input buffer once with JuliaSyntax.
In addition to fixing bugs, the more precise parsing should allow for new
features in the future.

Some features now work in more situations "for free", like dictionary key
completion (the expression evaluated to find the keys is now more precise) and
method suggestions (arguments beyond the cursor can be used to narrow the list).

The tests have been updated to reflect slightly differing behaviour for string
and Cmd-string completion: the new code returns a character range encompassing
the entire string when completing paths (not observable by the user), and the
behaviour of '~'-expansion has been tweaked to be consistent across all places
where paths can be completed.  Some escaping issues have also been fixed.

Fixes: JuliaLang#55420, JuliaLang#55518, JuliaLang#55520, JuliaLang#55842, JuliaLang#56389, JuliaLang#57611
@xal-0 xal-0 force-pushed the juliasyntax-repl branch from b62b012 to c539ec4 Compare March 13, 2025 23:49
@xal-0 xal-0 added the don't squash label Mar 14, 2025
@IanButterworth
Member

This sounds great! I think we should backport it to at least 1.12 given it fixes so many issues.

@IanButterworth IanButterworth added the backport 1.12 label Mar 14, 2025
Member

@giordano giordano left a comment

@xal-0 first of all, speaking as the author of most of the issues linked to this PR, thank you for tackling this!

Also, can you please add a test for #57772 as well? I verified this works already (which is amazing!)

@StefanKarpinski
Member

Bravo. Truly excellent quality-of-life PR 👏🏻

@xal-0 xal-0 removed the don't squash label Mar 14, 2025
@giordano
Member

> Bravo. Truly excellent quality-of-life PR 👏🏻

Best part is that this PR has a negative diff, despite the fact that it added many tests.

@xal-0
Member Author

xal-0 commented Mar 26, 2025

I am going to disable the shell completion tests until the shell mode can parse Windows paths...

shell> cd C:\Users
ERROR: IOError: cd("C:Users"): no such file or directory (ENOENT)
Stacktrace:
 [1] uv_error
   @ .\libuv.jl:106 [inlined]
 [2] cd(dir::String)
   @ Base.Filesystem .\file.jl:91
 [3] repl_cmd(cmd::Cmd, out::Base.TTY)
   @ Base .\client.jl:64
 [4] top-level scope
   @ none:1

xal-0 added 3 commits March 25, 2025 17:32
Also cleans up do_cmd_escape, so that it can use different escaping syntax from
the shell mode (which we may want to make similar to cmd.exe on Windows).
@xal-0 xal-0 force-pushed the juliasyntax-repl branch from 60dec58 to d48fd5e Compare March 26, 2025 19:48
@vtjnash
Member

vtjnash commented Mar 28, 2025

> until the shell mode can parse Windows paths

The shell mode has no problem with Windows paths which is why the REPL has tests for it. It just sounds like you lost a call to Base.shell_escape on that code path.

end

-function shell_completions(string, pos, hint::Bool=false)
+function shell_completions(string, pos, hint::Bool=false; cmd_escape::Bool=false)
Member

Since we always use the shell parser, why would we ever want this set true? I believe in the past, the code decided between whether it was completing one argument or several arguments, with escaping thus either wanting to include or exclude spaces. It thus chose to fail if the input contained space-separated arguments rather than handling them, so I don't know why that would ever be useful.

-    # escape_raw_string with delim='`' and ignoring the rule for the ending \
-    return replace(s, r"(\\+)`" => s"\1\\`")
+function do_cmd_escape(s)
+    return Base.shell_escape_posixly(Base.escape_raw_string(s, '`'))
Member

@vtjnash vtjnash Mar 28, 2025

This seems like nonsense: I can't think of any case where we'd expect it'd be useful for the REPL to pass the string unmodified to a posix shell for use as an argument in a julia script, which is what this transform order implements.

The previous transform took a string text that was already in a valid posix-shell form (e.g. for raw input to shell_parse for shell> text mode) and corrected it for the julia parser (e.g. for :(`text`)). Usually our shell_escape_posixly attempts to add quotes to avoid this edge case complexity, but if we have chosen to preserve the user-typed syntax then this is needed:

julia> a = `\\ b\\\\`
`'\' 'b\'`

Member Author

I intended cmd_escape to make it possible to complete shell commands inside Cmd-strings, but mixed up the order (Base.escape_raw_string(Base.shell_escape_posixly(s), '`') does what I want, I think):

julia> println(R.do_cmd_escape("file ` 1"))
'file \` 1'

julia> @macroexpand `'file \` 1'`
:(Base.cmd_gen((("file ` 1",),)))
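
For illustration, the two composition orders discussed in this thread can be compared side by side. This is a sketch only: `raw_then_shell` and `shell_then_raw` are hypothetical names, and both `Base.shell_escape_posixly` and `Base.escape_raw_string` are unexported internals whose behaviour may vary between Julia versions.

```julia
# Order used by do_cmd_escape in the diff above: escape for the backtick
# literal first, then quote for a POSIX shell.
raw_then_shell(s) = Base.shell_escape_posixly(Base.escape_raw_string(s, '`'))

# Order proposed in the reply: quote for the shell first, then escape the
# result so it can sit inside a backtick (Cmd) literal.
shell_then_raw(s) = Base.escape_raw_string(Base.shell_escape_posixly(s), '`')

# One input with a backtick, one with a trailing backslash.
for s in ("file ` 1", "dir\\")
    println(rpad(repr(s), 14), raw_then_shell(s), "   ", shell_then_raw(s))
end
```

Printing both results for inputs that contain backticks and trailing backslashes makes it easier to see where the two orders agree and where they diverge.
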

Member Author

Completion with shell_escape_posixly is still unwieldy because you get partial completions that insert only the opening quote, and can't complete any further.

@KristofferC KristofferC mentioned this pull request Mar 31, 2025
36 tasks
@KristofferC KristofferC mentioned this pull request Apr 4, 2025
51 tasks
@IanButterworth
Member

What's the status of this? I'm just wondering whether it's worth getting it in and continuing work in PRs?

@xal-0
Member Author

xal-0 commented Apr 14, 2025

With the REPL completions working well enough on Windows again, this is ready for review again. 👍

@vtjnash vtjnash merged commit ff0a931 into JuliaLang:master Apr 22, 2025
8 checks passed
@KristofferC
Member

🎤 ⬇️

@IanButterworth
Member

As someone who tried and failed to fix the issues this closes by patching the old approach, thank you for fixing this properly!

KristofferC pushed a commit that referenced this pull request Apr 22, 2025
# Overview

As we add REPL features, bugs related to the ad-hoc parsing done by
`REPLCompletions.completions` have crept in. This pull request replaces
most of the manual parsing (regex, `find_start_brace`) with a new
approach that parses the entire input buffer once, before and after the
cursor, using JuliaSyntax. We then query the parsed syntax tree to
determine the kind of completion to be done.

# Changes
- New, JuliaSyntax-based completions mechanism.

- The `complete_line` interface now has the option of replacing
  arbitrary regions of text in the input buffer by returning a `Region`
  (`Pair{Int, Int}` for consistency with the convention in LineEdit, and
  `pos` being a 0-based byte offset).

- Fixes parsing-related bugs:
  - fix #55420
  - fix #55429
  - fix #55518
  - fix #55520
  - fix #55842
  - fix #56389
  - fix #57307
  - fix #57611
  - fix #57624
  - fix #58099

- Fixes some bugs that exist on 28d3bd5 that were found by fuzzing:
  - `x \"` + `TAB` throws a `FieldError` exception
  - String completion would sometimes delete the entire input buffer.
  - Completions should not happen inside comments.

- The duplicate code for path completion in strings, `Cmd`-strings, and
  the shell has been removed, causing paths to complete the same way for
  all three. Now, `~` is expanded in two situations:
  - If `foo` exists, or if `foo` does not exist but there are no
    possible completions:
    ```
    "~/foo/b|"     =TAB=>   "~/foo/bar|"
    "~/foo/bar|"   =TAB=>   "/home/user/foo/bar|"
       OR
    "~/foo/bar"|   =TAB=>   "/home/user/foo/bar"|
    ```

  - If the current path ends with a `/` and you hit TAB again:
    ```
    "~/foo/|"      =TAB=>   "/home/user/foo/|"
       OR
    "~/foo/"|      =TAB=>   "/home/user/foo/"|
    ```
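
The two `~`-expansion rules above can be summarised as a small predicate. This is a hedged sketch rather than the code in REPLCompletions: `should_expand_tilde`, its `candidates` argument, and the injectable `exists` check are all hypothetical, and the sketch does not model the "hit TAB again" state, treating any path that ends in `/` as eligible.

```julia
# Hypothetical predicate sketching the two rules above; the names and the
# `exists` parameter are illustrative, not taken from REPLCompletions.
function should_expand_tilde(path::AbstractString, candidates::Vector{String};
                             exists = p -> ispath(expanduser(p)))
    startswith(path, "~") || return false
    # Second rule: the path already ends with a separator.
    endswith(path, "/") && return true
    # First rule: the path exists, or it does not exist and nothing completes it.
    return exists(path) || isempty(candidates)
end

# Inject a fake filesystem so the example does not depend on the real one.
fake_exists(p) = p in ("~/foo", "~/foo/bar")

println(should_expand_tilde("~/foo/b", ["bar"]; exists = fake_exists))    # false: complete to bar first
println(should_expand_tilde("~/foo/bar", ["bar"]; exists = fake_exists))  # true: the path exists
println(should_expand_tilde("~/foo/", ["bar"]; exists = fake_exists))     # true: trailing separator
```

Passing the existence check as a keyword argument keeps the sketch testable without touching the real home directory.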

# Future work
- Method completions could be changed to look for methods with exactly
  the given number of arguments if the closing `)` is present, and search
  for signatures with the right prefix otherwise.

- It would be nice to be able to search by type as well as value
  (perhaps by putting `::T` in place of arguments).

- Other REPL features could benefit from JuliaSyntax, so it might be
  worth sharing the parse tree between completions and other features:

    - Emacs-style sexpr navigation: `C-M-f`/`C-M-b`/`C-M-u`, etc.
    - Improved auto-indent.

- It would be nice if hints worked even when the cursor is between text.

- `CursorNode` is a slightly tweaked copy of `SyntaxNode` from
  JuliaSyntax that tracks the parent node but includes all trivia. It is
  used with `seek_pos`, which navigates to the innermost node at a given
  position so we can examine nearby nodes and the parent. This could
  probably duplicate less code from JuliaSyntax.

(cherry picked from commit ff0a931)
@KristofferC KristofferC mentioned this pull request Apr 29, 2025
53 tasks
LebedevRI pushed a commit to LebedevRI/julia that referenced this pull request May 2, 2025
@KristofferC KristofferC removed the backport 1.12 label May 5, 2025